Cancer Discovery — Latest Matching Preprints

1

From amplicon to antigen: a quantified transmission map that nominates multi-antigen antibody-drug-conjugate co-target sets across cancer types

Lam, J. M.; Walker-Samuel, S.; Pennycuick, A.

2026-07-16 oncology 10.64898/2026.07.13.26357987 medRxiv

Top 1%

1.5%

Somatic copy-number amplification is pervasive in cancer, and the genes it carries are candidate drug targets - but only those whose amplification is transmitted to accessible surface protein can be reached by an antibody-drug conjugate (ADC). We build an integrated map of copy-number-to-protein transmission across six tumour types and ask, for every amplified gene, whether its dosage reaches the surface. Copy number transmits to mRNA (median per-gene r = 0.21) but is attenuated at the protein level in 85% of genes, and the mRNA ranking is largely preserved to protein (rho = 0.70); the ranking is set principally at the chromatin/transcription step - among directly measured regulatory inputs, promoter DNA methylation and tumour chromatin accessibility each explain about an order of magnitude more of the transmission variance than gene structure, and do so complementarily. Critically, transmissibility is a stable, gene-intrinsic property: it is predictable from gene properties alone, with no proteomic input, at a leave-gene-out rank correlation of 0.52 (R² = 0.29); it is not positional (holding out whole chromosome arms changes accuracy by 0.001); and it transfers across lineages (Kendall W = 0.97 across leave-one-lineage-out refits). This licenses a predictor that nominates surface targets in cancer types that lack a tissue-referenced proteome, combining direct protein measurement where it is available with prediction where it is not.
Requiring co-elevation on a recurrent amplicon with measured transmissibility and an accessible extracellular ectodomain nominates 22 surface antigens on 18 distinct recurrent amplicons across four cancer types (renal, endometrial and both lung subtypes) - for example ITGB8+TSPAN13+TTYH3 on lung 7p, NCSTN+HSD17B7+MPZL1 on 1q (recurrent in several types), the transferrin receptor TFRC on squamous 3q, and FZD1 on clear-cell renal 7q; 21 of the 22 are non-driver passengers and 10 are confirmed on the experimental Cell Surface Protein Atlas. In single malignant cells, against a null that controls for per-cell sequencing depth, the co-detected constructs sit at a modest 1.05-1.45x above independence (p < 0.001, donor-block bootstrap intervals clear of 1.0), and at binding-relevant thresholds the normal-tissue co-expression collapses - so an avidity AND-gate that binds stably only where the antigens co-occur would spare normal cells that carry only one. Observed transmissibility itself transfers strongly between the two lung subtypes (rho = 0.88) and remains positive across distant lineages, consistent with the shared cell-of-origin regulation the map implies. Single-cell co-detection is demonstrated wherever a malignant single-cell atlas exists (both lung subtypes and glioblastoma - the latter entirely from prediction, using no GBM surface-abundance measurement); the remaining cohorts are nominated on the same genetic and topological evidence. The result is a pan-cancer, confidence-tiered catalogue of multi-antigen ADC co-target sets with a concrete plan to test them.
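
The cross-lineage agreement quoted in this abstract (Kendall W = 0.97) is a coefficient of concordance between several rankings of the same items. The following is a minimal sketch of the textbook, tie-free formula W = 12 S / (m^2 (n^3 - n)), where S is the sum of squared deviations of the per-item rank sums from their mean. The function name `kendalls_w` and the toy rankings are hypothetical; the abstract does not state how the authors ranked genes or handled ties.

```python
def kendalls_w(rankings):
    """Kendall's coefficient of concordance for m complete rankings.

    rankings: list of m lists, each a permutation of 1..n giving the rank
              that one refit (hypothetical here) assigns to each of n items.
    Assumes no tied ranks; the tie correction is deliberately omitted.
    """
    m = len(rankings)
    n = len(rankings[0])
    if n < 2:
        raise ValueError("need at least two items to rank")
    for r in rankings:
        if sorted(r) != list(range(1, n + 1)):
            raise ValueError("each ranking must be a permutation of 1..n")
    # Sum of ranks received by each item across the m rankings.
    rank_sums = [sum(r[i] for r in rankings) for i in range(n)]
    mean_sum = m * (n + 1) / 2
    # S: squared deviation of the rank sums from their common mean.
    s = sum((x - mean_sum) ** 2 for x in rank_sums)
    return 12 * s / (m ** 2 * (n ** 3 - n))


# Toy example: three refits ranking four genes, nearly but not fully agreeing.
toy_rankings = [[1, 2, 3, 4], [1, 2, 4, 3], [2, 1, 3, 4]]
w = kendalls_w(toy_rankings)
```

W equals 1 when all rankings are identical and 0 when the rank sums are all equal, so a value near 1 indicates that the refits order the items almost identically.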

2

Mechanism of response to FHD-286 and decitabine combination in patients with advanced myeloid malignancies

Collins, M. P.; Lahr, D. L.; Topal, S.; Khalil, A.; Hickman, D.; Spidale, N.; Pandit, N.; Reilly, S.; Lyons, K.; Horrigan, K.; Zhao, T.; Batonga, J.; Bosinger, M.; D'Aco, K.; Ball, B.; Kishtagari, A.; DiNardo, C. D.; Stein, E. M.; Quintas-Cardama, A.; Smolen, G. A.

2026-07-20 oncology 10.64898/2026.07.17.26358055 medRxiv

Top 2%

1.0%

Impaired cellular differentiation is a defining characteristic of myeloid malignancies and remains a major therapeutic challenge. The BRG1/Brahma-associated factor (BAF) chromatin remodeling complex, through the ATPases SMARCA4 and SMARCA2, maintains the stemness of leukemic blasts and thus represents a promising target for novel differentiation-based therapies. In a phase 1 study in advanced myeloid malignancies, the first-in-class dual SMARCA4/2 inhibitor FHD-286 combined with decitabine (DAC) was tolerated and produced an objective response rate of 12.8% (6/47) compared with no responses with FHD-286 monotherapy. To understand the basis of this activity, we integrated high-dimensional flow cytometry and single-cell genomic analyses of longitudinal bone marrow samples from responders and nonresponders. While FHD-286 monotherapy was predominantly associated with myeloid differentiation, responders to FHD-286+DAC combination therapy exhibited a range of myeloid and erythroid differentiation trajectories. FHD-286 potentiated the transcriptional impact of DAC, driving tumor clones to fully differentiate out of the immunophenotypically and transcriptionally defined blast compartment. Responders had a baseline transcriptional profile similar to that of CEBPA-mutant acute myeloid leukemia and showed further downregulation of CEBPA upon treatment. These findings reinforce tumor cell differentiation as a mechanism of response to pharmacologic SMARCA4/2 inhibition and support further evaluation of FHD-286+DAC in molecularly defined patient subsets.

3

Single-cell gene programs define subtype identity and metastatic trajectories in renal cell carcinoma

Madrigal, A.; Kim, M.; Mehrjoo, Z.; Nishimura, T.; Saatci, O.; Osakwe, A.; Zavacky, E.; Moslemi, E.; Glennon, K. I.; Dankner, M.; Maritan, S. M.; Kuasne, H.; Pilon, V.; Monast, A.; Soytas, M.; Arseneault, M.; Oikonomopoulos, S.; Harutyunyan, A.; Lu, T.; Rayes, R.; Soto, L. M.; Hernandez-Corchado, A.; Spicer, J. D.; Petrecca, K.; Siegel, P.; Park, M.; Ragoussis, J.; Sahin, O.; Brimo, F.; Tanguay, S.; Riazalhosseini, Y.; Najafabadi, H. S.

2026-07-16 genetic and genomic medicine 10.64898/2026.07.14.26357682 medRxiv

Top 2%

0.8%

While extensive cellular heterogeneity in renal cell carcinomas (RCC) is linked to diverse clinical outcomes, our understanding of this diversity is limited to those driven by clonal patterns or activity of canonical pathways. Here, we present a compendium of over 85,000 single-cell gene expression profiles from primary and metastatic tumors as well as patient-derived models across four RCC subtypes, including the rare clear cell papillary renal cell tumors, which we show are often misclassified and for which we identify CASP14 as a highly sensitive and specific biomarker. We dissect malignant cell variation within and across tumors using a generative modeling framework that accounts for clonal and copy number-driven expression shifts, defining 59 gene expression programs that deconstruct canonical pathways into functional submodules with divergent activity patterns, distinct regulators, and differential association with clinical outcomes. Despite the canonical view that VHL-deficient clear cell RCC exists in a constitutive pseudohypoxic state, we show strong intra-tumor variability of a hypoxia inducible factor 2 (HIF2)-driven program linked to poor outcome. We also identify early, spatially organized activation of a complete epithelial-to-mesenchymal transition (EMT) program, loss of epithelial identity, and upregulation of protein translation programs as key characteristics of metastatic progression. Finally, a metastatic signature capturing cellular de-differentiation and translational activity identifies primary tumors associated with adverse clinical outcomes. Together, this resource establishes a framework for dissecting malignant cell heterogeneity, refines RCC subtype classification, and defines transcriptional programs underlying metastasis progression.

4

Immune organization defines adaptive immune competence and clinical outcome in breast cancer

Sanfeliu, E.; Segui, E.; Martinez-Romero, A.; Albarran-Fernandez, V.; Pascual, T.; Marin, M.; Martinez-Saez, O.; Gomez-Bravo, R.; Garcia-Fructuoso, I.; Rodriguez-Hernandez, A.; Walbaum, B.; Galvan, P.; Angelats, L.; Rubio-Perez, C.; Saura, C.; Oliveira, M.; Ciruelos, E.; Manso, L.; Pernas, S.; Vidal, M.; Waks, A. G.; Tolaney, S. M.; Pare, L.; Parker, J. S.; Villagrasa, P.; Ferrero-Cafiero, J. M.; Perou, C. M.; Campo, E.; Tabernero, J.; Braso-Maristany, F.; Prat, A.

2026-07-20 oncology 10.64898/2026.07.17.26358324 medRxiv

Top 3%

0.4%

Tumor-infiltrating lymphocytes (TILs) are widely used to assess antitumor immunity in breast cancer but may not reflect the functional competence of adaptive immune responses. We show that immune organization, reflected by tertiary lymphoid structures (TLS) and coordinated humoral and cellular immune programs, represents a distinct dimension of tumor immunity beyond lymphocyte abundance. By integrating histologic, transcriptomic, spatial, and immune receptor profiling analyses across multiple breast cancer cohorts, we show that immune organization is associated with greater immune repertoire diversity, evidence of therapy-induced clonal selection, and improved clinical outcomes, independent of immune infiltration. Transcriptomic measures of immune organization retained independent prognostic value across external cohorts, whereas measures of immune infiltration did not. Furthermore, treatment-induced increases in immune organization, but not immune infiltration, were associated with therapeutic response. These findings identify immune organization as a dynamic and clinically measurable state of adaptive antitumor immunity with implications for prognosis, treatment monitoring, and therapeutic development in breast cancer.

5

CRISPR RNA-independent activation of Cas12a

Iwe, I. A.; Singh, S.; Guan, K.; Ocampo, R. F.; Ribeiro da Silva, S. J.; Wachholz Junior, D.; Emami, N.; Corsano, A.; Zeisler, I.; Bozovicar, K.; Wang, L.; Ham, D.; Cai, R.; Kelly, P.; Zayeni, R.; Nguyen, J.; Bayat, P.; Charania, M.; Palter, S.; Liu, F. X.; Shrestha, S.; Rayhan, A.; Wasney, G. A.; Mazzulli, T.; Green, A. A.; Li, Z.; Yao, S.; Hubbard, B. P.; Taylor, D. W.; Pardee, K.

2026-07-16 primary care research 10.64898/2026.07.14.26358058 medRxiv

Top 3%

0.3%

CRISPR-Cas12a nucleases are classically activated through CRISPR RNA (crRNA) guided and PAM-dependent target recognition, which together establish a canonical heteroduplex associated with nuclease activation. Here we identify a crRNA- and PAM-independent activation pathway for Cas12a that reveals previously unrecognized conformational plasticity within its nucleic acid recognition interface. We show that short RNAs can directly occupy the canonical crRNA-binding channel and trigger a catalytically competent trans cleavage state in the absence of PAM recognition or canonical R-loop formation. Biochemical assays indicate that short RNAs bind the crRNA-binding channel and are competitively displaced by cognate crRNA, consistent with binding at a conserved nucleic acid-binding interface. Cryo-electron microscopy (cryo-EM) further reveals that Cas12a maintains its global catalytic architecture while exhibiting loss of canonical PAM-dependent stabilization and increased flexibility of the RuvC lid, alongside accommodation of a noncanonical RNA-DNA hybrid with inverted polarity relative to the crRNA-target duplex. This crRNA-independent activation pathway enables programmable, amplification-free detection of DNA and RNA targets independent of canonical guide-mediated recognition. Together, these findings define an alternative activation geometry for Cas12a and expand models of Class 2 CRISPR-Cas effector activation beyond crRNA- and PAM-directed recognition.

6

Comprehensive molecular characterization of cutaneous squamous cell carcinoma reveals determinants of metastatic progression

Rentroia-Pacheco, B.; Sharma, H.; Pozza, L.; Traets, J. J. H.; Tandukar, B.; Steijlen, O. F. M.; Ruiter, R.; Cruz-Pacheco, N.; Huigh, D.; Van Hoeck, A.; Chen, Y.-T.; Infante, B.; Baskurt, D.; Arunachalam, V.; Eggermont, C. J.; Bas-Cristobal Menendez, A.; Nijsten, T.; van de Werken, H. J. G.; Mooyaart, A. L.; Bellomo, D.; Wakkee, M.; Shain, A. H.; Hollestein, L. M.

2026-07-20 oncology 10.64898/2026.07.17.26358051 medRxiv

Top 3%

0.2%

Cutaneous squamous cell carcinoma (cSCC) is the second most common form of cancer worldwide. While most cSCCs are not life-threatening, 2-5% of patients develop metastases. To better understand what causes some cSCCs to progress to metastatic disease, we assembled a nationwide cohort of 19,120 patients with clinico-pathologically annotated tumors linked to metastatic outcome. RNA-sequencing was performed on 378 tumors, and whole-exome sequencing on 147, with balanced numbers of tumors that progressed to metastatic disease (cases) and did not (controls). UV radiation was the dominant mutational signature with additional contributions from aging, APOBEC activity, and, in immunosuppressed patients, azathioprine exposure. We identified 38 genes under selection across a core set of signaling pathways. Gene expression clusters were primarily associated with the differentiation state of tumor cells and secondarily with the composition of the tumor microenvironment. Several mutational and transcriptional programs were associated with metastasis, including a dedifferentiated gene expression signature, activating mutations in the RAS signaling pathway, loss-of-function alterations in the SWI/SNF chromatin remodeling complex, and specific arm-level copy number alterations. A 23-gene expression signature was built to predict metastasis from primary cSCC tissue. The signature was validated in two independent cohorts (N=102 and 52), where it predicted metastasis independently of staging systems. Together, these findings provide the most detailed molecular portrait of cSCC to date and establish an assay for risk stratification suitable for clinical implementation.

7

In Silico Trial Simulation with Artificial Intelligence-Generated Synthetic Control Cohorts Reproduces Results of a Randomized Controlled Trial in Acute Myeloid Leukemia

Kumar Reddy, K.; Hahn, W.; Winter, S.; Roellig, C.; Mueller-Tidow, C.; Serve, H.; Baldus, C. D.; Fransecky, L.; Schliemann, C.; Burchert, A.; Schaefer-Eckart, K.; Kaufmann, M.; Schetelig, J.; Bornhaeuser, M.; Middeke, J. M.; Eckardt, J.-N.

2026-07-16 health informatics 10.64898/2026.07.15.26358123 medRxiv

Top 4%

0.2%

Rising costs, slow accrual and molecular substratification of cancers necessitate novel clinical trial designs. We demonstrate that artificial intelligence-generated synthetic patients can replace real controls to reproduce results of the SORAML trial. Using external multimodal data from 1,377 acute myeloid leukemia (AML) patients from previous trials and a real-world registry, we fine-tuned a tabular foundation model to generate synthetic patients, reproducing clinical and genetic features and outcome associations. Synthetic patients were then matched to the original SORAML intervention group using Cox risk scores, replacing the original control and reproducing the original trial result with near-identical median event-free survival (EFS) and treatment effect (original hazard ratio [HR] 0.64, 95%-confidence interval [CI] 0.47-0.87, p=0.004; with synthetic control HR 0.66, 95%-CI 0.48-0.90, p=0.009). Our findings demonstrate that AI-generated synthetic patients can serve as statistically rigorous controls supporting novel trial designs.

8

Proteogenomic mapping of multimorbidity identifies C1R linking coronary artery disease and dementia

Li, L.; Tang, Z.; Zhong, Z.; Geng, T.; Guo, Y.; Liao, Y.; Demirkan, A.; Bowden, J.; Bragg, F.; Pan, A.; Sun, X.; Liu, J.; Liu, G.; Liu, J.

2026-07-16 genetic and genomic medicine 10.64898/2026.07.14.26358022 medRxiv

Top 4%

0.1%

Multimorbidity is highly prevalent in ageing populations, yet its shared molecular basis remains poorly defined, limiting the development of therapies that target multiple conditions. We systematically integrated measurements of 1,954 circulating proteins from 54,219 individuals in discovery and 35,559 in replication, focusing on ten common age-related diseases: coronary artery disease, chronic kidney disease, chronic obstructive pulmonary disease, dementia, heart failure, major depressive disorder, osteoarthritis, Parkinson's disease, stroke, and type 2 diabetes. Coronary artery disease emerged as a central condition in the multimorbidity network, sharing circulating protein signatures with seven other diseases. Through genetic causal-inference analyses, we identified 40 circulating proteins with cross-disease relevance, of which four were further supported by colocalization of genetic variant associations. Among these, complement C1r, encoded by C1R, emerged as a key link between coronary artery disease and dementia, supported by independent colocalization evidence (PP.H4 = 0.86). Phenome-wide association analyses of C1R variants suggested that this signal was not driven by widespread unrelated genetic effects, but instead may reflect a more specific contribution to coronary artery disease-dementia pathogenesis. In vitro experiments further suggested that fibroblast-derived C1R promotes endothelial inflammation and neuronal apoptosis, providing mechanistic plausibility. Together, these findings position C1R as a biologically plausible and therapeutically relevant molecular link between coronary artery disease and dementia.

9

Nationwide Mpox Genomic Surveillance Reveals Clade Ib Introductions, APOBEC3-Driven Evolution, and Terminal Deletions

Brochu, H. N.; Shi, Q.; Song, K.; Zhang, Q.; Munroe, J.; Harris, N. J.; Britt, N.; Zeng, Q.; Kapuria, K.; Chappell, J.; Norvell, B. M.; Peavy, L.; Williams, J. D.; Harris, A. B.; Chaitram, J.; Hutson, C. L.; Deng, J.; McGrath, D.; Boles, D.; Dale, S. E.; Gigante, C. M.; Iyer, L. K.

2026-07-17 infectious diseases 10.64898/2026.07.15.26357894 medRxiv

Top 4%

0.1%

Background: The 2022-2023 global mpox outbreak highlighted the critical need for robust genomic surveillance capabilities to track mpox virus (MPXV) evolution and transmission dynamics. Methods: Building upon our established SARS-CoV-2 sequencing infrastructure, we implemented a Molecular Loop probe-based long-read sequencing approach using Pacific Biosciences Sequel II technology for comprehensive MPXV genomic surveillance across the United States (US). From August 2024 to June 2025, we generated 326 high-quality whole genome sequences from residual mpox-positive clinical specimens collected by Labcorp across all 10 US Department of Health and Human Services regions. Results: Our analysis identified two samples containing clade Ib MPXV in January and June 2025 and captured shifting trends in clade IIb diversity, with 13 distinct lineages observed. We also identified multiple instances of large (~1.6-17.6 kb) deletions proximal to the inverted terminal repeats in clade IIb genomes. APOBEC3 mutation analysis indicated substantial evidence of human-to-human transmission among both clades. Further, we observed significantly higher APOBEC3-associated SNPs per kilobase (P<0.001) in clade IIb genomic variable regions relative to their central conserved region. Our assay exhibited strong reproducibility across biological replicates from individual patients and accuracy was confirmed via parallel sequencing of select specimens by US Centers for Disease Control and Prevention (CDC) using metagenomic sequencing. We also demonstrated via custom simulation that our assay discriminates all known MPXV clades and lineages, including those we have not observed in the US. Conclusions: Our integrated nationwide surveillance system facilitates real-time genomic tracking of outbreak evolution, with demonstrated capacity across SARS-CoV-2 and MPXV, positioning this platform for rapid deployment during future pathogen emergence.
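
The "APOBEC3-associated SNPs per kilobase" comparison in this abstract can be illustrated with a small sketch. It assumes the commonly used dinucleotide-context definition of APOBEC3-type changes in MPXV (C>T where the preceding base is T, and the reverse-complement equivalent G>A where the following base is A); the abstract does not say which definition the authors applied. The function names, the 0-based coordinates and the toy sequence are all hypothetical.

```python
def is_apobec3_type(reference, pos, alt):
    """True if the SNP at 0-based `pos` matches the assumed APOBEC3 context.

    Assumed definition: TC>TT on the reference strand, or GA>AA (the same
    edit seen from the complementary strand).
    """
    ref = reference[pos].upper()
    alt = alt.upper()
    if ref == "C" and alt == "T":
        # C>T counts only when the 5' neighbour is T.
        return pos > 0 and reference[pos - 1].upper() == "T"
    if ref == "G" and alt == "A":
        # G>A counts only when the 3' neighbour is A.
        return pos + 1 < len(reference) and reference[pos + 1].upper() == "A"
    return False


def apobec3_snps_per_kb(reference, snps, start, end):
    """APOBEC3-type SNPs per kilobase within the half-open region [start, end).

    snps: iterable of (pos, alt) pairs with 0-based positions.
    """
    if end <= start:
        raise ValueError("region must have positive length")
    count = sum(
        1
        for pos, alt in snps
        if start <= pos < end and is_apobec3_type(reference, pos, alt)
    )
    return 1000.0 * count / (end - start)


# Toy data only: a 10-base "genome" and three SNP calls.
toy_reference = "ATCGGATTCA"
toy_snps = [(2, "T"), (4, "A"), (3, "A")]
density = apobec3_snps_per_kb(toy_reference, toy_snps, 0, len(toy_reference))
```

Computing this density separately for variable and conserved regions gives the two per-kilobase values that a comparison like the one reported would contrast; the statistical test itself is not sketched here.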

10

Multimodal gene prioritization reveals nonlinear regulatory architecture in childhood-onset asthma

Huang, N.; Ragsac, M. F.; Gui, X.; Tantisira, K. G.; Amariuta, T.

2026-07-16 genetic and genomic medicine 10.64898/2026.07.14.26357983 medRxiv

Top 5%

0.1%

Asthma is a heritable complex disease that disproportionately burdens minority and admixed populations in the US. However, the causal genes and regulatory mechanisms governing inherited risk remain largely unresolved. We performed a European-ancestry meta-analysis of 141,894 cases and 1,361,846 controls drawn from the Trans-national Asthma Genetic Consortium (TAGC) and Global Biobank Meta-analysis Initiative (GBMI), yielding an estimated h2SNP of 0.056 (SE = 0.0038) and 275 independently associated loci. To enhance mechanistic inference beyond variant-level associations, we developed a multimodal framework to predict asthma risk integrating GWAS summary statistics, bulk tissue expression quantitative trait loci (eQTL) data from the Genotype-Tissue Expression (GTEx) project, and single-cell gene eQTL data from the OneK1K Project. We performed transcriptome-wide association studies (TWAS) and subsequently applied probabilistic fine-mapping with FOCUS to prioritize putative causal genes expressed in bulk tissues and higher resolution immune cell populations. Fine-mapping asthma-associated genes implicated barrier-immune and metabolic-endocrine tissues alongside adaptive T-cell subsets as the primary mediators of asthma genetic risk, resolving canonical CD4+ Th2 effector genes including IL1RL1, TSLP, STAT6, and GATA3. Using these prioritized genes, we constructed a polygenic transcriptome risk score (PTRS) using random forest to integrate gene-level effects across critical tissues and cell types. Evaluated in two ancestrally distinct pediatric asthma cohorts, the Childhood Asthma Management Program (CAMP) and the Genetics of Asthma in Costa Rica Study (GACRS), our PTRS demonstrated improved transferability over the standard variant-level and gene-level baseline models. 
While modest common variant heritability limits the discriminative power of our models, we estimated a theoretical maximum achievable area under the receiver operating characteristic (AUROC) curve of 0.64. Our integrative nonlinear model of PRS-CSx and cross-modal (bulk tissue and single cell) FOCUS PTRS resulted in the best cross-cohort performance (CAMP AUC = 0.632, sd = 0.04, 3.55 case/control odds ratio in top vs. bottom quartiles), representing an increase of +0.118 AUC over PRS-CSx, +0.067 AUC over tissue-specific TWAS pruning and thresholding, and +0.041 AUC over cell-type-specific FOCUS PTRS. Our results demonstrate that modeling nonlinear interactions between variant- and gene-level effects across both bulk tissue and single cell eQTL data improves our ability to determine high-risk individuals and to explain the likely mechanisms driving genetic susceptibility of childhood-onset asthma.

11

Complex intra-host SARS-CoV-2 evolution following monoclonal antibody pre-exposure prophylaxis

Kamelian, K.; Pascall, D. J.; Cheng, M. T. K.; Meng, B.; Altaf, M.; Morse, R. M.; Aggio, J. B.; Egan, D. J. S.; Chen-Xu, M.; Trivioli, G.; Sutton, B.; Richter, A.; Gonzalez-Vazquez, L. D.; Cormie, C.; Kemp, S.; Yeadon, R.; Hyatt, B.; Wong, A.; Thesin Pelamkulangara, N.; Fraser, E.; McCarthy, B.; Novaes, F.; Stott, S.; Galvin, A.; Bellis, K. L.; De Angelis, D.; Harrison, E. M.; Martin, D.; Smith, R. M.; Gupta, R. K.

2026-07-17 infectious diseases 10.64898/2026.07.14.26356329 medRxiv

Top 6%

0.1%

Background: Monoclonal antibodies have emerged as a prophylactic strategy to prevent symptomatic SARS-CoV-2 infection in immunocompromised individuals. However, the evolutionary and clinical implications of breakthrough infections under this regime remain unclear. Methods: A male in their 80s with a haematological/oncological diagnosis received a 2000 mg intravenous infusion of sotrovimab in March 2023 and was diagnosed with COVID-19 by RT-qPCR from a nasopharyngeal swab in August 2023. Weekly samples (n=24) were collected through February 2024 (171 days). All samples underwent whole-genome sequencing, with select mutations subjected to functional assessment. Findings: Sequencing identified the GE.1 lineage at all timepoints. An intra-host recombination event in ORF1ab (positions 8942-12458) was detected prior to 23 weeks post-detection, followed by a 14-fold increase in viral load (7.42e+06 to 1.00e+08 RNA copies/mL) and a marked shift in the viral population. E340D, a sotrovimab resistance mutation, was detected at low abundance (46%) within the first week post-infection, fluctuated over time, and was nearly fixed by week 15 (107 days) post-detection. We assessed five spike mutations - V36M, S98F, and V213G in the N-terminal domain, Y505P in the receptor-binding domain, and P681Q near the S1/S2 cleavage site - and additionally evaluated the impact of E340D. V36M conferred the highest infectivity across all cell lines, with the most significant effect in low-TMPRSS2 cells. While all mutations showed enhanced infectivity with the addition of E340D, the effect was most pronounced in mutations with lower baseline infectivity. The addition of E340D significantly decreased relative neutralizing titres for V36M, S98F, and V213G, enabling escape from neutralizing antibodies in XBB-responsive individuals, illustrating an enhanced phenotypic advantage. 
Patient neutralizing activity was absent pre-sotrovimab, and sotrovimab-induced neutralization was further compromised by selection of E340D. Interpretation: Sotrovimab pre-exposure prophylaxis in an immunocompromised patient did not prevent SARS-CoV-2 infection, and selected for resistant mutation E340D, with unexpected fitness consequences across non-receptor binding domain spike regions.

12

Neonatal admission as a marker of risk for poor educational attainment and special educational needs in children aged 5-11 years

John, A.; Pike, C.; Olga, L.; Sovio, U.; Wong, H. S.; Smith, G. C.; Aiken, C.

2026-07-17 pediatrics 10.64898/2026.07.15.26358132 medRxiv

Top 6%

0.0%

Background: Children born prematurely (before 37 weeks) or admitted to the neonatal unit (NNU) are at increased risk of adverse long-term physical health outcomes. An association with later academic performance and special educational needs is also recognised; however, it is not clear whether these broad risk factors could be used as stand-alone heuristics to identify children who may benefit from additional support in educational settings. We aimed to examine the associations between NNU admission and educational attainment in mid-childhood. Methods and Findings: Pregnancy data from a prospective birth cohort (Pregnancy Outcome Prediction Study, Cambridge, United Kingdom, 2008-2012) were linked to national educational outcomes (Department for Education, United Kingdom). Multivariable regression models adjusted for maternal, child, and socioeconomic factors were used to evaluate associations between educational outcomes at ages 5-11 years and (i) all NNU admissions, (ii) term NNU admissions lasting >48 hours, and (iii) preterm birth without ongoing physical health needs. Children who required any NNU care were more likely not to meet expected educational standards across multiple ages and domains in early and mid-childhood: early years foundation at age 5 (aOR 1.64, 95% CI 1.19-2.27, p=0.003), phonics at age 6 (aOR 2.43, 95% CI 1.72-3.57, p<0.001), and, at age 7 (when assessments were divided into multiple domains), reading (aOR 1.67, 95% CI 1.18-2.38, p=0.004), writing (aOR 1.72, 95% CI 1.25-2.38, p<0.001), mathematics (aOR 1.56, 95% CI 1.09-2.22, p=0.020), and science (aOR 1.85, 95% CI 1.22-2.78, p=0.003). Similar patterns were observed both among term-born infants who stayed >48 hours in the NNU (phonics assessment at age 6 aOR 2.26, 95% CI 1.51-3.36, p<0.001) and among children born preterm without long-term physical health sequelae (phonics assessment at age 6 aOR 3.07, 95% CI 1.96-4.81, p<0.001).
These associations were robust to adjustment for demographic, perinatal, and socio-economic factors. By age 11, differences in academic attainment were attenuated and no longer clearly distinguishable across all exposure groups. However, there was an increased likelihood of special educational needs (SEN) at age 11 associated with any NNU admission (aOR 1.78, 95% CI 1.15-2.73, p=0.009), term NNU admission for >48 hours (aOR 1.88, 95% CI 1.19-3.00, p=0.007), and preterm birth without long-term physical health sequelae (aOR 1.50, 95% CI 1.00-2.25, p=0.049). Predictive performance of any NNU admission for SEN at age 11 was moderate (AUC 0.70, 95% CI: 1.14-2.65, p=0.010), with balanced sensitivity and specificity and high negative predictive value. Conclusions: NNU admission, for both term and preterm infants, is associated with poorer educational outcomes and an increased likelihood of special educational needs in mid-childhood.

13

FootNet: A Multi-View Smartphone Dataset and Four-Model Benchmark for Clinical Foot Segmentation

Vijay, A.; Prabhune, A.; Srihari, V. R.; Rayampalli, A.

2026-07-17 health informatics 10.64898/2026.07.15.26358117 medRxiv

Top 6%

0.0%

We present FootNet, a 453-image multi-view smartphone foot dataset for binary foot segmentation, with expert-annotated masks across six anatomical views (dorsal, medial, and plantar, both left and right). We benchmark four segmentation models under a controlled protocol: U-Net with a MobileNetV2 encoder achieves the best performance (IoU 0.9268, Dice 0.9608, 95% CI [0.9209, 0.9320]); DeepLabV3 with MobileNetV3-Large scores IoU 0.8984 (Dice 0.9449); UNet++ with MobileNetV2 scores IoU 0.8913 (Dice 0.9391); and SAM ViT-B with an oracle bounding-box prompt scores IoU 0.9219 on the matched 191-image subset. Bonferroni-corrected Wilcoxon signed-rank tests (k = 6 comparisons) show U-Net significantly outperforms DeepLabV3 (p < 0.001, r = 0.638) and SAM ViT-B with an oracle bounding-box prompt (p = 0.005, r = 0.202); UNet++ does not significantly differ from DeepLabV3 (p = 0.062). Connected-component postprocessing yields negligible benefit (mean ΔIoU = +0.0003, 12 of 453 images improved). The extended dataset is available upon request.
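
The two metrics reported in this benchmark, IoU and Dice, are both overlap ratios between a predicted and a reference binary mask. A minimal sketch of the standard definitions follows, using plain nested lists so that it has no dependencies. The function name `iou_and_dice`, the empty-mask convention and the toy masks are assumptions for illustration, not the authors' evaluation code.

```python
def iou_and_dice(pred, target):
    """Return (IoU, Dice) for two binary masks given as nested lists.

    IoU  = |P & T| / |P | T|
    Dice = 2 |P & T| / (|P| + |T|)
    Assumed convention: two empty masks count as a perfect match.
    """
    p = [bool(v) for row in pred for v in row]
    t = [bool(v) for row in target for v in row]
    if len(p) != len(t):
        raise ValueError("masks must contain the same number of pixels")
    intersection = sum(a and b for a, b in zip(p, t))
    union = sum(a or b for a, b in zip(p, t))
    if union == 0:
        return 1.0, 1.0
    return intersection / union, 2 * intersection / (sum(p) + sum(t))


# Toy 2x3 masks: two overlapping pixels, four pixels in the union.
toy_pred = [[1, 1, 0], [0, 1, 0]]
toy_target = [[1, 0, 0], [0, 1, 1]]
iou, dice = iou_and_dice(toy_pred, toy_target)
```

For a single image the two metrics are tied by Dice = 2 IoU / (1 + IoU). Averages over a dataset need not satisfy that identity exactly, which is one reason benchmarks usually report both.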

14

Efficient stochastic epidemic simulation via the Sellke construction

van Boven, M.; Bootsma, M. C.

2026-07-17 epidemiology 10.64898/2026.07.16.26358219 medRxiv

Top 6%

0.0%

Stochastic epidemic models are a cornerstone of infectious disease epidemiology and are often used to study intervention scenarios. However, large run-to-run variability can make intervention effects difficult to estimate precisely. We revisit the epidemic Sellke construction, which assigns each individual an infection threshold for the cumulative infection hazard such that, conditional on the thresholds, the epidemic trajectory becomes deterministic. This enables coupling of simulations with and without an intervention, yielding low-variance effect estimates even when outcomes such as final size or peak incidence vary widely between runs. We develop an exact, event-driven implementation that maintains infection and recovery events in priority queues. Cumulative infection-hazard updates require O(log N) time per event, yielding overall complexity O(E log N) for E events in a population of size N. The implementation achieves computational performance comparable to the classical Gillespie algorithm while naturally accommodating non-Markovian infectious periods and complex infectiousness profiles. We illustrate the approach using distance-dependent spread of avian influenza between poultry farms in the Netherlands and a multilayer population with households, schools, and workplaces. In both examples, coupling enables efficient within-run comparisons of intervention scenarios across stochastic realisations.
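
The construction described in this abstract can be sketched for the simplest case, a homogeneously mixing SIR population. This is an illustrative reduction and not the authors' implementation: it assumes Exp(1) thresholds, a hazard of beta / N per infective, gamma-distributed infectious periods, and an intervention that only lowers beta. All function and parameter names are hypothetical. Recovery times sit in a heap, and a sorted threshold list stands in for the infection queue, which is sufficient only because mixing is homogeneous.

```python
import heapq
import random


def sellke_final_size(thresholds, periods, beta, n_initial=1):
    """Number of initial susceptibles infected, via the Sellke construction.

    thresholds: one infection threshold per initially susceptible individual.
    periods:    infectious periods, used in order of infection (the first
                n_initial entries belong to the initial infectives).
    beta:       transmission rate; each infective adds beta / N to the hazard.
    """
    n = len(thresholds) + n_initial
    order = sorted(thresholds)            # infection happens in threshold order
    recoveries = list(periods[:n_initial])
    heapq.heapify(recoveries)             # min-heap of pending recovery times
    t, pressure, infected = 0.0, 0.0, 0
    while recoveries:                     # stop when nobody is infectious
        rate = beta * len(recoveries) / n  # slope of the cumulative hazard
        t_rec = recoveries[0]
        if infected < len(order) and rate > 0:
            # Time at which the hazard reaches the next-lowest threshold.
            t_inf = t + (order[infected] - pressure) / rate
        else:
            t_inf = float("inf")
        if t_inf < t_rec:                 # next event is an infection
            t, pressure = t_inf, order[infected]
            heapq.heappush(recoveries, t + periods[n_initial + infected])
            infected += 1
        else:                             # next event is a recovery
            pressure += rate * (t_rec - t)
            t = t_rec
            heapq.heappop(recoveries)
    return infected


def coupled_effect(seed, n=200, beta_base=2.0, beta_intervention=1.2):
    """Run baseline and intervention on the same thresholds and periods."""
    rng = random.Random(seed)
    thresholds = [rng.expovariate(1.0) for _ in range(n - 1)]
    periods = [rng.gammavariate(2.0, 0.5) for _ in range(n)]  # non-exponential
    base = sellke_final_size(thresholds, periods, beta_base)
    treated = sellke_final_size(thresholds, periods, beta_intervention)
    return base, treated
```

Because both runs share the same thresholds and infectious periods, the difference `base - treated` from `coupled_effect` is a within-realisation effect estimate, and under these assumptions the intervention run can never infect more individuals than its coupled baseline.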

15

General Practice Perspectives on Post-Infection Conditions: Scoping Review and UK Survey

Aung, K. W.; Scuffell, J.; Podlasek, A.; Engamba, S.; Jones, F.; Edwards, A.; Chew-Graham, C. A.; Sanyaolu, L.; Busse-Morris, M.

2026-07-17 primary care research 10.64898/2026.07.15.26358157 medRxiv

Top 6%

0.0%

Background: Post-infection conditions (PICs), such as Long Covid, are associated with heterogeneous, fluctuating symptoms that profoundly affect daily functioning. Despite moderate-certainty evidence from the NIHR-funded LISTEN trial (COV-LT2-0009) that personalised self-management support improves outcomes and may reduce societal and economic impacts of Long Covid, many people living with PICs still receive condition-specific services, generic advice, or stand-alone digital tools that do not address their complex needs. Aim: To map care approaches in general practice and synthesise UK evidence for PIC management. Design and setting: Scoping review and online survey. Method: A two-phase study was conducted: (1) a scoping review of UK evidence on PIC management in general practice; and (2) a supplementary online survey of practitioners working in UK general practice to provide contextual insights. Results: The scoping review identified 32 studies focused on Long Covid. One study included a comparator group (ME/CFS). Study populations were predominantly of white ethnicity and female. Evidence for non-Covid PICs in UK general practice was largely absent. The supplementary survey (n=46) provided preliminary practice-level insights. Healthcare practitioners reported varied PIC presentations, diagnostic uncertainty, limited referral pathways, inequitable access, and low confidence in managing PICs. Conclusion: Evidence informing PIC management in UK general practice remains predominantly Long Covid-focused and may not reflect the range of PICs encountered in practice. While survey findings are preliminary and require confirmation in larger samples, they highlight uncertainty around PIC management. Further research is needed to evaluate whether existing Long Covid pathways should be expanded or complemented by broader PIC models. Keywords: general practice; Long Covid; self-management; post-viral syndromes

16

Temporal relationships between distress and pain in people living with HIV

Arendse, G.; Kamerman, P.; Wadley, A.; Edwards, R. R.; Joska, J.; Parker, R.; Madden, V. J.

2026-07-17 primary care research 10.64898/2026.07.15.26358133 medRxiv

Top 6%

0.0%

Objective: There is a bidirectional relationship between emotional distress and pain. However, this relationship is understudied in people with HIV in low-resource settings. This study sought to describe the temporal relationship between emotional distress and pain in people with HIV. Design: Longitudinal observational study. Methods: Participants with virally suppressed HIV, reporting either no pain or persistent pain at baseline, provided weekly remote ratings of distress, worst pain, and average pain using 0-10 visual analogue scales. Within-individual fluctuations in distress and pain were visualised over time. Group-level correlations were determined using Spearman's correlation tests. Cumulative link mixed models assessed whether distress and pain each predicted the other in the following week. Results: 72 participants provided responses over 49 weeks. The participants had a median (IQR) age of 43 (37-51) years, 63% (n=45) were unemployed, and most were female (71%, n=51). Distress and pain fluctuated concurrently within individuals: distress was positively correlated with worst pain (ρ=0.66, 95% CI=0.60-0.72, p<0.001) and average pain (ρ=0.70, 95% CI=0.64-0.75, p<0.001) intensity within the same week. Worst pain (OR=1.42, 95% CI=1.17-1.71, p<0.001) and average pain (OR=1.43, 95% CI=1.20-1.71, p<0.001) intensity both predicted distress in the next week. Distress predicted worst pain intensity (OR=1.25, 95% CI=1.07-1.46, p=0.023) but not average pain intensity (OR=1.19, 95% CI=1.01-1.40, p=0.152) in the next week. Conclusions: The temporal relationship between distress and worst pain intensity was bidirectional, whereas distress did not temporally predict average pain intensity. Both pain and emotional distress should receive attention from HIV research and clinical care in low-resource settings.

17

Genome-Wide Association Studies and Deep-Learning Functional Annotation of Opioid Use Disorder across Three Ancestries in the All of Us Research Program

Gu, S.; Petrovitch, D.; Hall, O. T.; Lambert, J. W.; Kember, R. L.; Nahid, N. A.; Ma, Q.; Sprague, J. E.; McDonough, C. W.; Johnson, J. A.

2026-07-17 addiction medicine 10.64898/2026.07.15.26358096 medRxiv

Top 6%

0.0%

Background: Opioid use disorder (OUD) is heritable, yet most genome-wide association studies (GWAS) have focused on European populations, leaving the genetic architecture of OUD in non-European populations underexplored. Methods: We conducted GWAS of OUD across three ancestries using electronic health records and genomic data from 52,357 All of Us Research Program participants (8,912 cases; 43,445 matched opioid-exposed controls; 48.5% female). Participants were stratified into European (EUR), African (AFR), and Admixed American (AMR) ancestry groups for logistic regression GWAS, with independent replication in the Million Veteran Program. We then applied the deep-learning model AlphaGenome to predict the tissue-specific transcriptomic and splicing consequences of top risk variants across 13 reward-pathway brain regions. Results: We identified and replicated a novel DDX6 risk locus, alongside established OPRM1 and FURIN signals. AlphaGenome predicted the DDX6 regulatory allele downregulates the stress-resistance gene FOXR1 in the nucleus accumbens, while the protective OPRM1 variant (rs1799971) upregulates OPRM1 expression across reward networks. Other signals of interest included IL6R and SHISA9 (EUR); GHR (AFR); and ASTN2 (AMR). Conclusions: This study identifies DDX6 as a novel OUD risk locus, replicates associations with OPRM1 and FURIN, and highlights biologically plausible ancestry-specific signals in AFR and AMR populations. We also replicated top variants in an independent population. Finally, integrating GWAS with deep-learning annotations provides specific, localized biological hypotheses to guide future experimental validation and targeted therapeutics.

18

Trends and variations in Lithium usage across care settings in England between 2015-2024

Schiffer, H.; Fisher, L.; Curtis, H. J.; Wood, C.; Brown, A. D.; Bacon, S. C.; Croker, R.; Goldacre, B.; MacKenna, B.; Speed, V.; Macdonald, O.

2026-07-17 psychiatry and clinical psychology 10.64898/2026.07.15.26357641 medRxiv

Top 6%

0.0%

Lithium has been the gold standard for the treatment and prevention of relapse in bipolar disorder for over 60 years. Guidance from the National Institute for Health and Care Excellence explicitly advises clinicians to 'offer lithium as a first-line, long-term pharmacological treatment for bipolar disorder'. Yet in the last two decades its use has declined, with clinicians favouring anticonvulsants or antipsychotics when treating this condition. In this study, we used three openly available datasets containing prescribing data from primary and secondary care to explore trends in the use of lithium in England, showing both regional and temporal variation between 2015 and 2024. We show that lithium use declined in primary care by 20.9% in the last ten years (2015-2024) and 10.9% overall in the last five years (2019 to 2025). We also show that there is some regional variation in the source of lithium for patients, although the vast majority is prescribed in primary care. Further research into clinical behaviour is needed to understand what is driving the decrease in lithium usage, and what barriers and enablers may influence its use across the country.

19

Human GPR174 deficiency drives polyclonal lymphoproliferative disease via defects in T cell function

Huang, Y.-H.; Arana, K.; Rachimi, S.; Tam, H.; Spegarova, J. S.; Engelhardt, K. R.; Griffin, H.; Mee, M.; Miano, M.; Raggi, F.; Grossi, A.; Rusmini, M.; Ceccherini, I.; Dell'Orso, G.; Ferro, J.; Giarratana, M. C.; Pillai, V.; Banka, S.; Garcez, T.; Briggs, T. A.; Mellouli, F.; von Hardenberg, S.; Beier, R.; Auber, B.; Baumann, U.; Tawamie, H.; Behrens, E.; Oldridge, D. A.; Cabrera, E. C.; Xu, Y.; Ouyang, S.; Hambleton, S.; Romberg, N.; Cyster, J. G.

2026-07-17 rheumatology 10.64898/2026.07.14.26357774 medRxiv

Top 6%

0.0%

The X-linked G-protein coupled receptor GPR174 is highly expressed in T and B lymphocytes and has immunoregulatory roles in mice, but its function in humans is unknown. We describe a cohort of six individuals who have function-disrupting variants in GPR174 and a clinical phenotype of lymphadenopathy and autoimmunity. Histological analysis of two patient lymph nodes revealed necrotizing lymphadenitis and lymphoproliferation resembling Kikuchi-Fujimoto disease. In-depth analysis of three patients and related carriers revealed overaccumulation of CD8 terminally differentiated effector memory cells re-expressing CD45RA (TEMRA). Patient cells and GPR174-deficient CD8 T cells generated from controls showed less repression of proliferation by the GPR174 ligand lysophosphatidylserine (lysoPS) and an effector-biased gene expression program. GPR174-deficient CD4 T cells were resistant to lysoPS-mediated suppression of IL2 production. In mice, chronic viral infection led to over-accumulation of GPR174-deficient effector CD8 T cells. We describe an inborn error of immunity associated with dysregulated lymphocyte responses that we propose predisposes to exaggerated lymphoproliferation and autoimmunity following viral infection.

20

Comparative Efficacy of Vancomycin and Fidaxomicin Regimens for the Prevention of Recurrent Clostridioides difficile Infection: A Systematic Review and Network Meta-Analysis of Randomized Controlled Trials

Prosty, C.; Butler-Laporte, G.; Brophy, J.; Frenette, C.; Loo, V.; Coburn, B.; Hota, S.; Longtin, Y.; Kong, L.; Muller, M.; Steiner, T.; Valiquette, L.; Daneman, N.; Daley, P.; Nott, C.; MacFadden, D. R.; Kandel, C.; Chen, Y.; Perez-Patrigeon, S.; Lee, T. C.; McDonald, E.

2026-07-17 infectious diseases 10.64898/2026.07.14.26358112 medRxiv

Top 6%

0.0%

Background and Aims: The optimal treatment for first episodes and first recurrences of Clostridioides difficile infection (CDI) is unknown, and there is emerging evidence for pulse and taper (P-T) regimens. Therefore, we sought to estimate the relative efficacy of treatment options. Methods: MEDLINE and CENTRAL were searched from database inception to May 21, 2025, and unpublished conference abstracts were searched from recent infectious disease conferences. RCTs on the treatment of first episodes or first recurrences of CDI comparing fixed-dose or P-T regimens of fidaxomicin or vancomycin were included. The primary and secondary outcomes were 40- and 56-day CDI recurrence, respectively. A random-effects network meta-analysis on the risk ratio (RR) scale was conducted using a standard regimen (10-14 days) of vancomycin as the comparator. Treatments were ranked using the surface under the cumulative ranking curve (SUCRA). Results: Eight RCTs were included, comprising a total of 2181 patients. For 40-day recurrence, fidaxomicin P-T had the highest probability of ranking best (RR=0.10, 95% confidence interval [95% CI]=0.10-0.49, SUCRA=1.00), followed by vancomycin P-T (RR=0.49, 95% CI=0.32-0.76, SUCRA=0.61), fixed-dose fidaxomicin (RR=0.61, 95% CI=0.49-0.76, SUCRA=0.39), and, finally, fixed-dose vancomycin (SUCRA=0.00). The treatments ranked in the same order for 56-day recurrence, though only 3 RCTs reported on this timepoint. Conclusion: Vancomycin P-T, fidaxomicin P-T, and fixed-dose fidaxomicin were all superior to fixed-dose vancomycin. Head-to-head comparative effectiveness RCTs are needed to quantify their relative effect sizes and impact on long-term prevention of recurrent CDI.
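SUCRA summarises each treatment's rank distribution as a number between 0 (always worst) and 1 (always best). The sketch below shows the usual calculation from a matrix of rank probabilities. The function name `sucra` and the probabilities are invented for illustration and are not the study's data. The abstract reports only the resulting SUCRA values.

```python
def sucra(rank_probs):
    """SUCRA per treatment.

    rank_probs[k][j] is the probability that treatment k holds rank j + 1,
    where rank 1 is best. Each row is assumed to sum to 1.
    """
    a = len(rank_probs)                     # number of treatments
    scores = []
    for probs in rank_probs:
        cumulative = 0.0
        total = 0.0
        for p in probs[:-1]:                # cumulative probabilities for ranks 1..a-1
            cumulative += p
            total += cumulative
        scores.append(total / (a - 1))
    return scores


# Hypothetical rank probabilities for four treatments (illustrative only).
example = [
    [1.0, 0.0, 0.0, 0.0],   # always ranked best
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.5, 0.5, 0.0],
    [0.0, 0.0, 0.0, 1.0],   # always ranked worst
]
scores = sucra(example)
```

SUCRA values across all treatments in a network average 0.5. The four values reported in the abstract (1.00, 0.61, 0.39, 0.00) are consistent with that property.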